OPTIMIZING DEFRAGMENTATION OPERATIONS IN A
DIFFERENTIAL SNAPSHOTTER

CROSS-REFERENCE TO RELATED APPLICATIONS
This invention claims priority based on U.S. Provisional Patent Application Serial 
No. 60/419,252, filed on October 16, 2002, which is hereby incorporated in its entirety by 
reference. 

FIELD OF THE INVENTION 
The present invention relates generally to data storage, and more particularly to 
snapshots of file system volumes. 

BACKGROUND OF THE INVENTION 
Data storage is an essential feature of computer systems. Such storage typically 
includes persistent data stored on block-addressable magnetic disks and other secondary 
storage media. Persistent data storage exists at several levels of abstraction, ranging from 
higher levels that are closer to the logical view of data seen by users running application 
programs, to lower levels that are closer to the underlying hardware that physically 
implements the storage. At a higher, logical level, data is most commonly stored as files 
residing in volumes or partitions, which are associated with one or more hard disks. The 
file system, which can be regarded as a component of the operating system executing on 
the computer, provides the interface between application programs and nonvolatile 
storage media, mapping the logically meaningful collection of data blocks in a file to 
their corresponding physical allocation units, or extents, located on a storage medium,
such as clusters or sectors on a magnetic disk.

Users and administrators of computer systems benefit from having the ability to 
recover earlier versions of files stored on the system. Users may accidentally delete or 
erroneously modify files. An administrator of a system that has become corrupted may 
wish to recover the entire state of a file system at some known good time before the
corruption occurred. The underlying disk hardware can fail. A snapshot is one 
technique for facilitating the recovery of earlier versions of files. 

A snapshot of a volume is a virtual volume representing a point in time on the
original volume. Some snapshotters capture the point-in-time data by mirroring the
entire contents of the volume in its snapshot state. By contrast, differential snapshotters
do not make actual copies at the time of the snapshot. Rather, changes to the original
volume are carefully monitored so that the virtual volume (i.e., the snapshot) can always
be produced. A differential snapshotter will copy a block in the volume only if it is
modified after the snapshot is taken; such a copy operation is called a "copy-on-write."
The snapshot state of the volume can be reconstructed by using these copies of changed
blocks along with the unchanged blocks in the original volume. In the usual case, many
files in the volume will be left unchanged following the snapshot, so differential
snapshotters provide a more economical design than nondifferential approaches. As
many changes occur to the original volume, however, a differential snapshotter must keep
a large area of disk space to hold the older versions of the disk blocks being changed.
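
By way of illustration only, the copy-on-write behavior described above can be sketched in a few lines of Python. The class name, the list standing in for the volume, and the dictionary standing in for the differential storage are all hypothetical choices made for this sketch and are not taken from any particular snapshotter.

```python
class DifferentialSnapshot:
    """Toy differential snapshot over a list of block contents (hypothetical)."""

    def __init__(self, volume):
        self.volume = volume      # live blocks, modified in place
        self.diff_area = {}       # block number -> contents at snapshot time

    def write(self, block, data):
        # Copy-on-write: preserve the old contents only on the first
        # modification after the snapshot was taken.
        if block not in self.diff_area:
            self.diff_area[block] = self.volume[block]
        self.volume[block] = data

    def read_snapshot(self, block):
        # Changed blocks come from the saved copies, unchanged blocks
        # from the original volume.
        if block in self.diff_area:
            return self.diff_area[block]
        return self.volume[block]


snap = DifferentialSnapshot(["a", "b", "c"])
snap.write(1, "B")
snap.write(1, "BB")   # second write to the same block: no further copy
```

Only one saved copy exists after the two writes, which is the economy that the differential approach offers when few blocks change.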

In most operating systems, the extents that make up the physical allocation units 
implementing a particular file may be discontiguous, as may the pool of allocation units 
available as logically free space for use in future file space allocation. A disk volume in
such a state is said to be externally fragmented. In many such operating systems, a
volume can be expected to suffer from increasing external fragmentation over time as
files are added, deleted and modified. External fragmentation increases the time
necessary to read and write data in files, because the read/write heads of the hard disk
drive will have to increase their lateral movement to locate information that has become
spread over many non-contiguous sectors. If fragmentation is sufficiently severe, it can 
lead to significantly degraded performance and response time in the operation of the 
computer system. 

Defragmentation utility programs provide an important remedy for data storage 
systems that are prone to external fragmentation. These utilities can be periodically run
to rearrange the physical location of a volume's file extents so that contiguity of
allocation blocks is increased and disk read/write access time is correspondingly reduced,
improving performance. A defragmentation operation consists of moving some blocks in
a file to a location that is free on the volume. More precisely, the contents of one block
are copied to the free block location. The old location of the block becomes free and the
new location of the block becomes occupied space. The defragmentation of a volume 
will typically involve an extensive number of such block moves. 

Although users of file systems benefit from the disk speed optimizations achieved 
by defragmentation, the benefit has come at the expense of efficient use of differential 
snapshotters. If a volume is defragmented subsequent to the taking of a snapshot, the
snapshotter will ensure that each data block relocation by the defragmenter is preceded
by a copy-on-write of the block. The logical view of the original volume is unchanged
by the defragmentation operations, but because the disk blocks on which the disk is
physically manifested change drastically in content, the amount of space needed to
maintain the snapshot explodes. This disk space explosion may be enough to destroy a 
principal reason for using differential snapshotters in the first place, that of disk space 
economy. 

The problem seen in the interaction between differential snapshotters and
defragmentation operations is that, prior to the present invention, differential snapshotters
have not been able to distinguish logically significant writes of blocks from logically
insignificant block moves, treating both as requiring copy-on-write protection. This
problem is particularly acute when there is a volume defragmentation operation on the
original volume, but those of skill in the art will appreciate that other file-manipulating
programs besides defragmenters may require the nonlogical relocation or shuffling of file
blocks. For example, a program might, for performance reasons, create a file of a
particular size and arrange the blocks in a desired way before proceeding with further use
of the file for writing data. Prior to the present invention, differential snapshotters have
treated such block rearrangements as requiring copy-on-write protection.

It can be seen, then, that there is a need for an improvement in differential 
snapshotters so that logically insignificant moves of blocks from one volume location to 
another are recognized as not requiring copy-on-write protection in principle. The 
availability of more efficient differential snapshotters will make more likely the use of
snapshots applied on a longer-term basis for data recovery. Moreover, such an
improvement will lead to greater use of defragmentation utilities and therefore will allow
disk speed optimizations to take place while having snapshots with little performance 
impact and little disk space consumed.

SUMMARY OF THE INVENTION
The present invention provides a method for capturing and maintaining a
differential snapshot of an original volume in which logically significant modifications of
blocks, which require copy-on-write protection, are distinguished from logically
insignificant block moves, which in principle do not need to be preceded by
copy-on-write operations. The invention involves the use of a file system with the ability to pass a
BLOCK_COPY command down to lower-level, block-oriented drivers, a capacity not
available in previous file systems, which enables such drivers to take advantage of
hardware acceleration for data block movements. In particular, a snapshot driver,
informed by the file system that a requested operation is a nonlogical block move, uses
this enrichment in knowledge to avoid unnecessary copy-on-write operations. Instead,
the snapshotter simply updates the translation table data structures it employs to keep
track of which blocks must be protected by copy-on-write operations and where the
snapshot versions of blocks are being stored.

Those skilled in the art will readily perceive that the present invention is also 
applicable to differential snapshots of files and volumes contained on block devices other 
than magnetic disk media and to the use of differential snapshotters to reconstruct
time-defined versions of other persistent data structures. Other aspects and advantages of the
invention will become apparent from the following detailed description, taken in
conjunction with the accompanying drawings, illustrating by way of example the 
principles of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flow diagram illustrating the steps taken in an embodiment of the 
invention with respect to a block move from a block location A to a block location B; 
FIG. 2 is a flow diagram illustrating the steps taken under two scenarios in an 
embodiment of the invention following the block move from A to B of FIG. 1 in the case
where, before the move, the snapshotter bitmap bit for block B is set and the bitmap bit 
for block A is clear, with FIG. 2A illustrating the scenario where A is written, and with 
FIG. 2B illustrating the scenario where there is a write to B; 

FIG. 3 is a flow diagram illustrating the steps taken in an embodiment of the 
invention following the block move from A to B of FIG. 1 in the case where, before the
move, the bitmap bit for block B is set and the bitmap bit for block A is clear, and where,
after the move, a write of B has not yet occurred and a move of block B to a block 
location C is initiated; 

FIG. 4 illustrates one possible computer in the context of which an embodiment of 
the present invention may be practiced;

FIG. 5 illustrates an exemplary multi-level secondary storage system associated 
with a computer, such as the computer of FIG. 4, in the context of which an embodiment 
of the present invention may be practiced; 

FIG. 6 is a diagram presenting a detailed example of the handling of a logically 
significant block write in an embodiment of the invention, with FIG. 6A providing the
view before the write and FIG. 6B providing the view after the write; 

FIG. 7 is a diagram presenting a detailed example of the handling of a simple 
logically insignificant block move in an embodiment of the invention, with FIG. 7A
providing the view before the block move and FIG. 7B providing the view after the
move; 

FIG. 8 is a diagram continuing the detailed example of FIG. 7, presenting the 
handling of two logically significant block write requests in an embodiment of the 
invention, including a write at a block location from which a data block was nonlogically
moved, and a write at the block location to which that block was moved, with FIG. 8A 
providing the view before the writes and FIG. 8B providing the view after the writes; 

FIG. 9 is a diagram continuing the detailed example of FIG. 7, presenting the 
handling of a second logically insignificant block move following the first move depicted 
in FIG. 7, with FIG. 9A providing the view before the block move and FIG. 9B providing
the view after the move; 

FIG. 10 is a diagram continuing the detailed example of FIG. 9, presenting the 
handling of a third logically insignificant block move following the second move 
depicted in FIG. 9, where the move is to the original block location as presented in FIG.
7, with FIG. 10A providing the view before the block move and FIG. 10B providing the
view after the move; 

FIG. 11 is a flow diagram presenting a high-level view of the steps taken in an 
embodiment of the invention with respect to capturing and maintaining the snapshot; 

FIG. 12 is a flow diagram presenting the steps taken in an embodiment of the 
invention with respect to the handling of a logically significant write request;

FIG. 13 is a flow diagram presenting the steps taken in an embodiment of the 
invention with respect to the handling of a logically insignificant request to move a block; 
and

FIG. 14 is a flow diagram presenting the steps taken in an embodiment of the
invention with respect to the handling of a request to read a block in the virtual volume
corresponding to the snapshot of the original volume.

DETAILED DESCRIPTION OF THE INVENTION
A differential snapshotter does not have to perform any copy-on-write operations
on disk space that was logically unused at the time of the snapshot. This is true because
the disk blocks that are free on that snapshot will never need to be read when the
snapshotter produces a logical volume file or directory. For this reason, a differential
snapshotter may have a bitmap of the blocks on the volume. It may set the bit to one bit
value, such as 1, for blocks that are free at the time that the snapshot was taken, and it
may set to the same value the bits corresponding to blocks that have already had a
copy-on-write since the time of the snapshot. Clearly, only bits that have the other bit
value (0 if the first bit value is 1) need to have their blocks copied-on-write. (In the
accompanying drawings it is assumed that the first bit value, which may be called an
"ignore" value, is 1 and that the second bit value, which may be called a "protect" value,
is 0. However, the invention is of course equally applicable to embodiments which use 0
as the "ignore" value and 1 as the "protect" value.)
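
The bitmap convention just described may be sketched as follows. This is a minimal illustration assuming the bit values used in the drawings (1 for "ignore", 0 for "protect"); the function names and the use of a Python list as the bitmap are hypothetical.

```python
IGNORE, PROTECT = 1, 0   # bit values assumed in the accompanying drawings


def initial_bitmap(num_blocks, free_blocks):
    """Blocks free at snapshot time are ignored; occupied blocks are protected."""
    return [IGNORE if b in free_blocks else PROTECT for b in range(num_blocks)]


def needs_copy_on_write(bitmap, block):
    """Only blocks whose bit still has the protect value must be copied."""
    return bitmap[block] == PROTECT


def mark_copied(bitmap, block):
    """After a copy-on-write the block no longer needs protection."""
    bitmap[block] = IGNORE


bitmap = initial_bitmap(4, free_blocks={2, 3})
```

An embodiment using the opposite bit values would only need the two constants swapped.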

A defragmentation operation consists of moving some blocks in a file to a
location that is free on the volume. The old location of the block becomes free and the
new location of the block becomes occupied. Therefore, it suffices for a differential
snapshotter in accordance with the invention to be informed that a block is moving from
A to B so that it can change its view of what is free space and what is occupied space
without performing any copy-on-write operations but instead simply updating a
translation table.

FIGS. 1-3 illustrate details of an embodiment of the invention in handling a block
move from block A to block B. Turning to FIG. 1, the procedure begins at step 11. The
differential snapshotter is informed that a block is moving from A to B by way of a
BLOCK_COPY command passed down by the file system (step 13), rather than a
READ_BLOCK followed by a WRITE_BLOCK. This tells the differential snapshotter
what operation is taking place. The differential volume snapshotter keeps a bitmap of
one bit for every block, where the bit being set indicates that the snapshotter does not
need to take any action when the block is written. A clear bit indicates that the snapshotter
has to take the copy-on-write. The snapshotter keeps a translation table of (Block # → Device,
Block #) to support reading the snapshot.

If the B bit is clear (step 15), then the snapshotter will copy-on-write the B block
(step 17) before it is written by the move operation (step 19) so that there is an entry in
the table for the B block (step 21) and the B bit is set in the bitmap (step 23).

If the B bit is set in the bitmap, there may or may not be an entry in the table for 
the B block. If B is free space at the time of the snapshot then there is no entry in the 
table. If the A bit is set (step 27), then the move operation writes B (step 29) and the
snapshotter is done (step 25). There is no point in doing anything if changes to A can be
ignored. 

At this point we have reduced this problem to the case where the bit for block B is
set and the bit for block A is clear. Now we let the move happen (step 29) and then
change the bits to the A bit being set (step 31) and the B bit being clear (step 33). We
add two entries to the translation table: (A → SameDevice, B) (step 35) and (B ->>> A)
(step 37), where the ->>> symbol is used to denote that B originally comes from A. The
second type of entry provides for fast lookup and, in an embodiment of the invention, it
may be used within the same table data structure as the first type of entry with no extra
overhead. Those of skill in the art will recognize that the two kinds of table entry may
equivalently be kept in two tables, and that reverse lookup may equivalently be
performed in a translation table using only the first type of table entry.
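
The three cases of FIG. 1 may be sketched as follows. This is an illustrative reading of the text rather than a definitive implementation: it assumes that the branch in which the B bit is clear ends once B has been copied and the move has been made, and that block A is not itself holding the snapshot data of an earlier-moved block. The dictionary layout and function names are hypothetical; set bits are represented as IGNORE and clear bits as PROTECT.

```python
IGNORE, PROTECT = 1, 0   # set bit = ignore, clear bit = protect


def copy_on_write(state, block):
    """Preserve the current contents of `block` in the differential area."""
    state["diff_area"].append(state["volume"][block])
    offset = len(state["diff_area"]) - 1
    # If `block` holds snapshot data on behalf of another block, the
    # table entry belongs to that block; otherwise to `block` itself.
    origin = state["reverse"].pop(block, block)
    state["forward"][origin] = ("DiffAreaVolume", offset)
    state["bitmap"][block] = IGNORE


def block_copy(state, a, b):
    """Handle a BLOCK_COPY (logically insignificant move) from a to b."""
    bitmap, volume = state["bitmap"], state["volume"]
    if bitmap[b] == PROTECT:
        copy_on_write(state, b)              # steps 17, 21, 23
        volume[b] = volume[a]                # step 19
        return
    volume[b] = volume[a]                    # step 29
    if bitmap[a] == IGNORE:
        return                               # changes to A can be ignored
    bitmap[a], bitmap[b] = IGNORE, PROTECT   # steps 31, 33
    state["forward"][a] = ("SameDevice", b)  # step 35: (A -> SameDevice, B)
    state["reverse"][b] = a                  # step 37: (B ->>> A)


# A is occupied at snapshot time (block 0); B is free space (block 1).
state = {"volume": ["a-data", "free"], "bitmap": [PROTECT, IGNORE],
         "forward": {}, "reverse": {}, "diff_area": []}
block_copy(state, 0, 1)
```

In the example the move completes with an empty differential area: only the bitmap and the two table entries change.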

FIG. 2 continues the illustration of FIG. 1 where, originally, the bit for block B
was set and the bit for block A was clear, presenting the steps taken by the snapshotter
with respect to a subsequent write of block A in FIG. 2A and a subsequent write of block
B in FIG. 2B. In FIG. 2A, following the completion of the steps illustrated in FIG. 1 (step
41), henceforth A can be written freely (steps 43, 45), as its bit is set. In FIG. 2B,
following the completion of the steps illustrated in FIG. 1 (step 51), a command to write
to B in step 53 will result in a copy-on-write of B (step 55) followed by the write (step
57). The copy-on-write of B is then recorded in the table in place of the previous entry
(A → SameDevice, B), yielding (A → DiffAreaVolume, DiffAreaVolumeOffset) (step
59); the (B ->>> A) entry is deleted (step 61), and the B bit is set (step 63).
DiffAreaVolume and DiffAreaVolumeOffset represent the differential storage space
volume device and block number, respectively, to which block B is copied.
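
The two scenarios of FIG. 2 may be illustrated with the following sketch, which starts from the state left by a move from A to B. The data layout and the function name write_block are hypothetical and chosen only for this example.

```python
IGNORE, PROTECT = 1, 0   # set bit = ignore, clear bit = protect


def write_block(state, block, data):
    """Handle a logically significant WRITE_BLOCK request."""
    if state["bitmap"][block] == PROTECT:
        # Copy-on-write: save the old contents to the differential area.
        state["diff_area"].append(state["volume"][block])
        offset = len(state["diff_area"]) - 1
        # A (B ->>> A) entry names the block whose snapshot version is
        # stored here; it is deleted once that data has been copied away.
        origin = state["reverse"].pop(block, block)
        state["forward"][origin] = ("DiffAreaVolume", offset)
        state["bitmap"][block] = IGNORE
    state["volume"][block] = data


# State after the move from A (block 0) to B (block 1) of FIG. 1.
A, B = 0, 1
state = {"volume": ["a-data", "a-data"], "bitmap": [IGNORE, PROTECT],
         "forward": {A: ("SameDevice", B)}, "reverse": {B: A},
         "diff_area": []}
write_block(state, A, "new-a")   # FIG. 2A: written freely, no copy
write_block(state, B, "new-b")   # FIG. 2B: copy-on-write, table updated
```

After the second write the entry for A points into the differential area and the reverse entry for B is gone.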

FIG. 3 continues the illustration of FIG. 1 where, originally, the bit for block B 
was set and the bit for block A was clear, the steps associated with the move from A to B 
have occurred (through step 37 of FIG. 1), and a subsequent write of B has not yet 
occurred (step 69). In step 71, a move of block B to block C is initiated. The rules 
presented in FIG. 1 then apply, with block B now the old location (corresponding to 
block A in FIG. 1) and block C the new location (corresponding to block B in FIG. 1). 
The B bit is clear (from step 33 in FIG. 1). If the C bit is clear (step 73), then the 
snapshotter will copy-on-write the C block (step 75) before it is written by the move
operation (step 77) so that there is an entry in the table for the C block (step 79) and the C
bit is set in the bitmap (step 81). 

If, prior to the move, the C bit is set, we let the move happen (step 85) and then
change the bits to the B bit being set (step 87) and the C bit being clear (step 89).
However, in preparing to insert (B → SameDevice, C) into the translation table, we find the
(B ->>> A) table entry in place. At this point, the snapshotter effects a composition,
yielding the entries (A → SameDevice, C) (step 91) and (C ->>> A) (step 93), which
replace (A → B) and (B ->>> A) (steps 95, 97).
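
The composition of table entries described for FIG. 3 may be sketched as below. The function name record_move and the use of two dictionaries for the forward and reverse entries are assumptions made for illustration; the text notes that both kinds of entry may equally be kept in a single table.

```python
def record_move(forward, reverse, src, dst):
    """Update the table entries after a protected block moves from src to dst."""
    # If src already holds another block's snapshot data, compose the two
    # moves so that the original block maps directly to the new location.
    origin = reverse.pop(src, src)
    forward[origin] = ("SameDevice", dst)
    reverse[dst] = origin


forward, reverse = {}, {}
record_move(forward, reverse, "A", "B")   # (A -> SameDevice, B), (B ->>> A)
record_move(forward, reverse, "B", "C")   # composed: (A -> SameDevice, C), (C ->>> A)
```

Because the second call replaces the earlier entries rather than adding new ones, a chain of moves never leaves more than one forward and one reverse entry per protected block.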

FIGS. 4-14 illustrate aspects of embodiments of the invention in further detail.
FIG. 4 illustrates one exemplary computing environment 100 within which the present
invention may be performed. The environment 100 includes a general-purpose
stored-program computer machine 110, which may be connected to one or more other
computer-based resources, such as a remote computer 180 connected to the computer device 110 by
a local area network 171 or wide area network 173. The computer machine 110 includes
at least one central processing unit 120 connected by a system bus 121 to a primary
memory 130. One or more levels of a cache 122, connected to or situated within the
processing unit 120, act as a buffer for the primary memory 130. Programs, comprising
sets of instructions for the machine 110, are stored in the memory 130, from which they
can be retrieved and executed by the processing unit 120. In the course of executing
program instructions, the processing unit 120 retrieves data 137 stored in the memory 130
when necessary. Among the programs and program modules stored in the memory 130
are those that comprise an operating system 134.

The exemplary computer machine 110 further includes various input/output
devices and media for writing to and reading from the memory 130, including secondary
storage devices such as a non-removable magnetic hard disk 141, a removable magnetic
disk 152, and a removable optical disk 156. Such computer-readable media provide
nonvolatile storage of computer-executable instructions and data; the hard disk 141 is
also commonly used along with the primary memory 130 in providing virtual memory. It
will be appreciated by those skilled in the art that other types of computer-readable media
that can provide volatile and nonvolatile storage of data accessible by a computer may
also be used in the exemplary computer environment 100. The computer 110 has a file
system 142 associated with the operating system 134. The file system 142 serves as an
interface that maps a set of logically-organized named files to data physically stored on 
secondary media, such as data stored in clusters or sectors on the hard disk 141. 

The diagram of FIG. 5 illustrates an exemplary multi-level secondary storage 
system associated with a computer such as the computer depicted in FIG. 4, in the 
context of which an embodiment of the invention may be practiced. A differential 
snapshotter 211 may be regarded as a driver that mediates between the file system 207
and a block driver 215. The block driver 215 provides sector-level access to data
contained in volumes 221, 225 corresponding to hard disks 219, 223. The snapshotter
211 accesses data at the sector level through the block driver 215. Executing programs
201, 205, such as a disk defragmentation utility 203, access stored data at a higher, 
logical level through the file system interface 207. 

The differential snapshotter 211 is directed to take a snapshot 217 of an original
disk volume 221 at a specified point in time. The snapshot is a virtual volume 217
containing the versions of files in the volume 221 as they existed at the time of the
snapshot. Initially, no copying of data in the original volume 221 is done by the
differential snapshotter 211. After the time of the snapshot, the snapshotter 211 monitors
and intercepts efforts by the file system 207 to access data blocks in the original volume
221 on behalf of executing programs 201, 203, 205. If the file system 207 attempts to
write new data to a block, the snapshotter 211 first consults a bitmap 209 to determine
whether it must preserve the data in that block with a copy-on-write operation before the 
write attempt can proceed. If a copy-on-write is necessary, the snapshotter 211 writes the 
copy to a special differential storage area 227, possibly stored in another volume 225 on 
another disk 223, recording information identifying the copied block and the location in 
which it was copied in one or more table data structures 213. 

In embodiments of the invention, the file system 207 has the capacity to pass a 
BLOCK_COPY command to lower-level drivers, enabling lower-level drivers to take 
advantage of hardware acceleration for data block copies. In particular, the file system 
can pass the BLOCK_COPY command down to the snapshot driver 211 to request a
logically insignificant relocation of a block from one block location to another in the
volume 221. Having received the BLOCK_COPY request, which signifies that the
requested data movement is not logically significant, the snapshotter 211 may be able to
avoid performing a copy-on-write by using the bitmap 209 and tables 213 in a manner 
described in further detail below. 

The snapshotter 211 also enables the file system 207 to read snapshot versions of 
files. To the file system 207 the snapshot virtual volume 217 appears to be another block 
device, which the file system 207 can mount. If a requested file that was in the original
volume at the time of the snapshot has been logically changed or nonlogically moved
since the time of the snapshot, the snapshotter 211, consulting its tables 213, will redirect
the read request to the appropriate location in the differential storage space 227 or in the 
original volume 221 where that snapshot version is stored. 
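
The redirection of a snapshot read may be sketched as follows, assuming a table that maps a block number to a (device, offset) pair in the manner of the translation table entries described earlier. The function name and the device labels are hypothetical.

```python
def read_snapshot_block(block, volume, diff_area, table):
    """Return the snapshot-time contents of `block`."""
    entry = table.get(block)
    if entry is None:
        return volume[block]        # neither changed nor moved since the snapshot
    device, offset = entry
    if device == "DiffAreaVolume":
        return diff_area[offset]    # preserved by a copy-on-write
    return volume[offset]           # "SameDevice": relocated within the volume


volume = ["new-0", "old-1", "old-2-moved-here", "old-3"]
diff_area = ["old-0"]
table = {0: ("DiffAreaVolume", 0), 2: ("SameDevice", 2)}
```

A block with no table entry is read directly from the original volume, which is the common case when few blocks have changed.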

As mentioned above, a bitmap 209 is used by the snapshotter 211 to determine
whether a particular block location must be protected by a copy-on-write operation. In
the bitmap 209, a particular bit represents a particular block in the volume 221. When the
snapshot is captured, a subset of the blocks in the volume 221 will be logically occupied,
in the sense that they are at that moment being used to implement existing files. Another
subset of blocks will constitute logically free space. In the initial configuration of the
bitmap 209, all occupied-space blocks will have their corresponding bits set to "protect,"
and all free-space blocks will have their bits set to "ignore," because there is no reason to
perform a copy-on-write for a block that was logically insignificant at the time of the
snapshot. In the embodiment illustrated in the examples of FIGS. 1-3 above and in the
examples discussed below, the "ignore" value is 1 and the "protect" value is 0. It should
be noted that once a copy-on-write is performed for a particular block, it is no longer
necessary for the snapshotter 211 to protect that block.

Referring now to FIG. 6, the depicted example illustrates how the snapshotter
handles the straightforward case of a logically significant request to write a block
location. In FIG. 6A, the snapshotter has intercepted a WRITE_BLOCK call 301 from
the file system, which seeks to write data 303 at the block location here designated COS
307. The bit 317 in the bitmap 319 corresponding to this block is 0, so the block 307
must be protected with a copy-on-write operation 311 copying its data to differential
storage space 313 located on a volume 315. FIG. 6B presents the view after the
copy-on-write has taken place and after the write of block COS 323 has been permitted to go
forward. The bit 343 corresponding to this block 323 is set to 1, since no further
protection of the snapshot version of this block will be needed. The copy-on-write has
been made at location D01 341 in the differential storage space 331. A table data
structure 333, mapping blocks 327 to the location 329 at which the snapshot versions of
those blocks are stored, records the fact that block COS 335 has been copied to
differential location D01 337.

Referring now to FIG. 7, the depicted example shows the simplest case involving
a logically insignificant block move, such as that which might be requested by the file
system during the execution of a disk defragmentation operation following the time of the
snapshot. The example illustrates how a copy-on-write operation is avoided in such a
situation without any loss of information regarding the contents and location of the
snapshot version of the protected block. FIG. 7A represents the situation after the request
is intercepted but before it is permitted to proceed. The snapshotter is made aware of the
nonlogical nature of the requested operation by the file system's use of a BLOCK_COPY
call 405, in accordance with the invention, instead of READ_BLOCK and
WRITE_BLOCK calls. Here the request involves the relocation of the data in block C03
407 to block COS 409 in the same volume 401. In the bitmap 403, the bit 413
corresponding to block C03 407 is 0, so some effort must be made to preserve the data in
this block 407 as the snapshot version of block C03 407. The bit 415 corresponding to
the destination block 409 is set to 1, as might be expected if the requested move is a
defragmentation operation selecting a current free-space location in the volume 401 as
the new location for the block data being moved. If the bitmap bits 413, 415
corresponding to blocks C03 407 and COS 409 in FIG. 7A had been other than 0 and 1,
respectively, the snapshotter would have handled the BLOCK_COPY request 405
differently. This will be explained below in the discussion of the flow diagram of FIG.
13.

As a consequence of the requested block move, a logically occupied block, which
is one of the blocks that must be protected by the snapshotter, becomes free space, and a
free-space block becomes occupied space. This change can be reflected in the bitmap
simply by exchanging the bit values 411 in the two bits 413, 415 corresponding to the
two blocks 407, 409 involved in the move. FIG. 7B depicts the situation after the block
move has taken place. Block COS 425 now holds the data that was previously held in
block C03 421, and the corresponding bits 423, 427 in the bitmap 419 have been
switched. The relocation of the snapshot version of block C03 435 to block COS 437 is
recorded in the table 429. The mapping here is a translation to another offset in the
volume 417. If the snapshotter receives a request to read the snapshot version of block
C03, it will look up C03 435 in the table 429 and find that the snapshot copy is currently
located at COS 437. The read request will be directed to block COS 425.

Referring now to FIG. 8, the depicted example proceeds from the state of FIG.
7B. In FIG. 8A, two logically significant WRITE_BLOCK requests 551, 553 are
received for the respective block locations C03 507 and COS 509, the same locations that
were involved in the preceding logically insignificant move. The request 551 to write
block C03 507 will be allowed by the snapshotter without further action, since its
corresponding bit 513 in the bitmap 503 is set to 1, indicating that it can be written freely.
The bit 515 corresponding to block COS 509, however, is 0, so it must be protected with a
copy-on-write before it can be written. FIG. 8B illustrates the situation following the
writes. Blocks C03 521 and COS 525 now hold the new data. The bitmap bit 523
corresponding to block C03 521 remains 1, of course. The bit 527 corresponding to
block COS 525 is set to 1 following the copy-on-write 543 depicted in FIG. 8A. The
copy-on-write 543 copied the old value of COS 509, which is the snapshot version of
current block C03 521, to location D02 547 in the differential storage space 549. In the
storage/translation table 529, the mapping 537 for block C03 535 is updated accordingly,
recording D02 541 as the current location of the snapshot block COS 539.

Although the diagrams of FIGS. 6-10 show a single mapping table for illustrative
simplicity, an additional reverse mapping table may be used. This reverse mapping table
may be stored as part of the same data structure as the direct-mapping translation table, as
in the flow diagrams of FIGS. 1-3, or, in the alternative, it may be maintained as a
separate data structure. A reverse mapping table entry provides, for fast lookup, the
mapping from a first block in the original volume to a second block in the same volume,
the second block signifying the location whose snapshot version the first block is holding.
In the example of FIG. 8, the snapshotter looks up COS in the reverse mapping table,
finding COS mapped to C03, the block location of COS's data at the time of the snapshot.

While the case of FIGS. 7 and 8 is one in which there was ultimately no net
benefit in the original avoidance of a copy-on-write, in general it is impossible to predict
whether there will be a logically significant write to a block that has previously been the
subject of a logically insignificant move. In the case of a block move pursuant to a
defragmentation operation, it is particularly likely that the benefit of avoiding the
copy-on-write will be preserved, since the defragmentation of an entire volume of blocks will
involve many moves, only a small number of which can be expected to be the subject of
subsequent logical writes.

Referring now to FIG. 9, the example depicted therein proceeds from the state of
FIG. 7B and illustrates how the snapshotter handles the move of a previously-moved
block. In FIG. 9A, the snapshotter intercepts a file system BLOCK_COPY command
605 for a logically insignificant move 643 from block COS 609 to block C10 607, in
accordance with the invention. The bitmap bits 615, 613 for these blocks are 0 and 1
respectively, as in the example of FIG. 7, and again the bits 615, 613 will be exchanged
645 in order to update the bitmap 603 to reflect the changed block configuration. The
snapshotter looks up COS 637 in the reverse mapping table corresponding to the depicted
table 629, finding the reverse mapping to COS 635, signifying that block COS 609 is the
current location of the snapshot version of block COS 635. As shown in FIG. 9B,
representing the state after the data previously stored in block COS 625 has been moved
to C10 653, the table 655 is updated so that COS 647 is mapped compositionally to C10
649 rather than to COS 641. The bits 627, 651 corresponding to blocks COS 625 and C10
653 respectively have been exchanged, with C10's bit 651 now having the protect value
0.

Referring now to FIG. 10, the example of FIG. 9 is continued in FIG. 10A, with a
file system attempt 715 to nonlogically move the data in block C10 709 to block COS
705, using the BLOCK_COPY command 713 in accordance with the invention. The
move destination 705 is also the snapshot-time location of data currently stored in C10
709. The bitmap bits 711, 707 corresponding to blocks C10 709 and COS 705 are 0 and 1
respectively, and the bits are exchanged 717, as seen in FIG. 10B following the move,
where C10's bit 743 is now 1 and COS's bit 741 is 0, as in the original bitmap 703. A
lookup of C10 731 in the reverse mapping table corresponding to the depicted table 719
reveals C10 731 to be the current location of the snapshot version of block COS 727. The
appropriate update to the table 745 is the entry 761, 755 mapping COS to COS, but this is
a cycle that may simply be removed from the table. Thus, with respect to block COS 735,
the snapshot-time status quo has been restored.

The algorithms applied in the previous examples are presented in further detail in
the flow diagrams of FIGS. 11-14. FIG. 11 represents a procedural overview of an
embodiment of the invention. At step 800 the procedure is begun. In step 802 the
snapshotter captures a snapshot of an original disk volume at a point in time, following
which, in step 804, it creates the associated bitmap, initially assigning 1 (the "ignore"
value) to logically free blocks and 0 (the "protect" value) to logically occupied blocks. In
step 806 the snapshotter assumes the role of monitoring file system requests to access
blocks in the original volume, as well as the role of enabling the file system to read the
snapshot virtual volume. The method relating to the snapshot of step 802 terminates in
step 808.
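
By way of illustration only, the bitmap creation of step 804 might be sketched as
follows. The function name create_bitmap, the list representation of the bitmap, and
the six-block volume are assumptions made for the example and are not taken from the
figures.

```python
IGNORE = 1   # block was logically free at snapshot time
PROTECT = 0  # block was logically occupied at snapshot time


def create_bitmap(volume_size, occupied_blocks):
    """Build the per-block bitmap described for step 804."""
    occupied = set(occupied_blocks)
    return [PROTECT if block in occupied else IGNORE
            for block in range(volume_size)]


# Hypothetical six-block volume in which blocks 0, 2 and 3 hold file data.
bitmap = create_bitmap(6, [0, 2, 3])
print(bitmap)  # [0, 1, 0, 0, 1, 1]
```

A block whose bit is 1 may be overwritten without any snapshot bookkeeping, while a
block whose bit is 0 must be preserved before it is changed.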

FIGS. 12-14 expand upon the post-snapshot step 806 of FIG. 11. These diagrams,
like the flow diagrams of FIGS. 1-3, assume that the snapshotter maintains one
translation table holding up to two mappings for each original volume block entry a. One
mapping, denoted a -> b, signifies that block b currently stores the snapshot copy of a. A
second mapping, denoted a -»> c, the reverse mapping referred to above, signifies that
block a currently stores the snapshot copy of c.
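
One possible in-memory representation of such a table is sketched below, purely as an
illustration. The class name TranslationTable, its method names, and the use of two
dictionaries are assumptions of the sketch. It also assumes that no reverse mapping is
kept for a differential storage location, since the reverse mapping is described above
as relating two blocks of the same volume.

```python
class TranslationTable:
    """One entry per original-volume block, holding up to two mappings."""

    def __init__(self):
        self.direct = {}   # a -> b: location b stores the snapshot copy of a
        self.reverse = {}  # a -»> c: block a stores the snapshot copy of c

    def map_in_volume(self, a, b):
        """Record that in-volume block b holds the snapshot copy of a."""
        self.direct[a] = b
        self.reverse[b] = a

    def map_to_differential(self, a, d):
        """Record that differential location d holds the snapshot copy of a."""
        self.direct[a] = d  # assumption: d lies outside the volume, no reverse

    def unmap(self, a):
        """Drop a's direct mapping and any matching reverse mapping."""
        b = self.direct.pop(a, None)
        if self.reverse.get(b) == a:
            del self.reverse[b]


table = TranslationTable()
table.map_in_volume(3, 8)           # block 8 now holds the snapshot copy of 3
print(table.direct, table.reverse)  # {3: 8} {8: 3}
```

Keeping the reverse mapping alongside the direct mapping lets the snapshotter answer
"whose snapshot data does this block hold?" with a single lookup.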

The flow diagram of FIG. 12 presents the steps associated with the interception of
a logically significant WRITE_BLOCK from the file system. Following the entry into
the procedure (step 900), in step 902 the snapshotter detects an effort by the file system to
logically write block k in the original volume. In step 904, the snapshotter checks the
value of the corresponding bit in the bitmap. If this bit is 1, the file system write can
proceed (step 914) and the snapshotter exits the procedure (step 916). If the bit is 0, the
block data must be protected. A copy-on-write operation copies the block to a
differential storage location d (step 906), and the bit corresponding to the copied block is
set to 1 (step 908), permitting subsequent accesses of the block to be ignored.

In step 910 the snapshotter determines whether there is an entry k -»> j in the
table, reverse-mapping k to some block j in the original volume. If so, block k is the
current location of the snapshot version of block j. The snapshotter removes this reverse
mapping (step 918) and the corresponding direct mapping j -> k from the table (step
920). It makes a new table entry j -> d, recording differential storage location d as the
current location of the snapshot version of j (step 922). At step 914 the file system is
permitted to write block k, and the snapshotter then exits (step 916). If, however, there
was no reverse-mapping entry for k in the table, the snapshotter makes an entry k -> d in
the table (step 912). Block k can then be written by the file system (step 914), and the
algorithm terminates (step 916).
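
The write-interception steps of FIG. 12 might be sketched as follows, as an
illustration under stated assumptions rather than a definitive implementation. The
function name handle_write, the dictionary-based volume, bitmap and tables, the
block numbers 3, 5 and 8, and the naming scheme for differential storage locations
are all hypothetical.

```python
def handle_write(k, data, volume, bitmap, direct, reverse, diff_store):
    """Intercept a logically significant write of `data` to block k."""
    if bitmap[k] == 1:                 # step 904: bit holds the "ignore" value
        volume[k] = data               # step 914
        return
    d = "D%02d" % len(diff_store)      # hypothetical naming of location d
    diff_store[d] = volume[k]          # step 906: copy-on-write
    bitmap[k] = 1                      # step 908
    if k in reverse:                   # step 910: an entry k -»> j exists
        j = reverse.pop(k)             # step 918
        del direct[j]                  # step 920: remove j -> k
        direct[j] = d                  # step 922: add j -> d
    else:
        direct[k] = d                  # step 912: add k -> d
    volume[k] = data                   # step 914


# Hypothetical state: block 8 holds the snapshot copy of block 3 after an
# earlier move, and block 5 still holds its own snapshot-time data.
volume = {3: "stale", 5: "old5", 8: "old3"}
bitmap = {3: 1, 5: 0, 8: 0}
direct, reverse, diff_store = {3: 8}, {8: 3}, {}

handle_write(3, "new3", volume, bitmap, direct, reverse, diff_store)
handle_write(8, "new8", volume, bitmap, direct, reverse, diff_store)
handle_write(5, "new5", volume, bitmap, direct, reverse, diff_store)
print(direct)      # {3: 'D00', 5: 'D01'}
print(diff_store)  # {'D00': 'old3', 'D01': 'old5'}
```

In this sketch the write to block 3 proceeds freely, the write to block 8 redirects
the snapshot copy of block 3 to differential storage, and the write to block 5 takes
the branch in which no reverse mapping exists.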

The flow diagram of FIG. 13 presents the steps associated with the interception of
a file system attempt to nonlogically move a block of data from one block location j to
another block location k in the volume. The snapshotter enters the procedure (step 1000)
and receives the move request (step 1002). The bitmap bits for the source and destination
blocks are examined respectively in steps 1004 and 1006. If the bit corresponding to
block j is 1, or if the bit corresponding to block k is 0, the snapshotter will treat the
request as a READ_BLOCK on j to be followed by a WRITE_BLOCK on k using the
data stored in j (step 1007). To handle the WRITE_BLOCK on k, the snapshotter follows
the procedure outlined in FIG. 12 (step 1009).

If the bit corresponding to j is 0 and the bit corresponding to k is 1, the
optimization associated with the invention can be realized. The snapshotter determines
whether there is a reverse-mapping entry j -»> i in the table, mapping j to some block i
in the same volume (step 1008). If so, j is currently storing the snapshot version of block
i. The direct-mapping table entry i -> j is deleted (step 1010), and the corresponding
reverse-mapping table entry j -»> i is deleted (step 1012). If i and k are not the same
block location, determined at step 1014, a direct-mapping entry i -> k is added to the
table (step 1016), as is the corresponding reverse mapping k -»> i (step 1018). These
two steps are skipped if i and k are the same. In either case, the bits corresponding to j
and k are swapped (step 1024), the block move is allowed to proceed (step 1040), and the
procedure terminates (step 1042), the block move having been achieved without a
copy-on-write operation.
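
The move-interception steps of FIG. 13 might be sketched as follows, by way of
illustration only. The function name handle_move, the fallback_write callback
standing in for the procedure of FIG. 12, and the block numbers 3, 8 and 10 are
hypothetical. The text above does not spell out the branch in which block j has no
reverse-mapping entry; the sketch assumes that in that case j is holding its own
snapshot-time data, so that a direct mapping j -> k and a reverse mapping k -»> j are
recorded, consistent with the table state described for FIG. 8.

```python
def handle_move(j, k, volume, bitmap, direct, reverse, fallback_write):
    """Intercept a nonlogical move of block j's data to block k.

    Returns True when the move completed without a copy-on-write.
    """
    if bitmap[j] == 1 or bitmap[k] == 0:   # steps 1004, 1006
        fallback_write(k, volume[j])       # steps 1007, 1009
        return False
    if j in reverse:                       # step 1008: an entry j -»> i exists
        i = reverse.pop(j)                 # step 1012: remove j -»> i
        del direct[i]                      # step 1010: remove i -> j
    else:
        i = j  # ASSUMPTION: j still holds its own snapshot-time data
    if i != k:                             # step 1014
        direct[i] = k                      # step 1016: add i -> k
        reverse[k] = i                     # step 1018: add k -»> i
    bitmap[j], bitmap[k] = bitmap[k], bitmap[j]   # step 1024
    volume[k] = volume[j]                  # step 1040
    return True


# Hypothetical volume: block 3 is occupied at snapshot time, 8 and 10 are free.
volume = {3: "A", 8: None, 10: None}
bitmap = {3: 0, 8: 1, 10: 1}
direct, reverse, calls = {}, {}, []


def record(k, data):
    """Stand-in for the FIG. 12 procedure; it only records the request."""
    calls.append((k, data))


handle_move(3, 8, volume, bitmap, direct, reverse, record)
after_first = dict(direct)
handle_move(8, 10, volume, bitmap, direct, reverse, record)
after_second = dict(direct)
handle_move(10, 3, volume, bitmap, direct, reverse, record)
print(after_first, after_second, direct)  # {3: 8} {3: 10} {}
```

The three moves mirror the sequence of FIGS. 7, 9 and 10: a first move, a move of
the previously-moved block with the mapping composed, and a move back to the
snapshot-time location, after which the cycle is dropped and the table is empty.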

Finally, the flow diagram of FIG. 14 presents the steps taken by the snapshotter
in enabling the file system to read the virtual snapshot volume. The procedure begins at
step 1100, and at step 1102 a file system request to read a particular block v in the
snapshot volume is received. The snapshotter determines whether there is an entry
v -> w in the table (step 1104). If such an entry exists, it signifies that the snapshot copy
of block v is stored at another location w, either in the same volume or in the differential
storage space. The snapshotter directs the file system read to w (step 1106), and the
procedure terminates (step 1110). If there is no entry for v in the table, the snapshot copy
of block v is the same as the current contents of block v in the original volume. The
snapshotter therefore directs the read to the actual block v (step 1108), and the procedure
terminates (step 1110).
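
The read path of FIG. 14 might be sketched as follows, again as an illustration
only. The function name read_snapshot_block, the dictionary representation, and
the convention that differential storage locations are string keys such as "D02"
while in-volume locations are integers are assumptions of the sketch.

```python
def read_snapshot_block(v, volume, direct, diff_store):
    """Return the snapshot-time contents of block v."""
    if v not in direct:           # step 1104: no entry v -> w
        return volume[v]          # step 1108: read the actual block v
    w = direct[v]
    if w in diff_store:           # w lies in the differential storage space
        return diff_store[w]      # step 1106
    return volume[w]              # step 1106: w lies in the same volume


# Hypothetical state: block 3's snapshot copy sits in block 8, block 6's
# snapshot copy sits in differential storage, and block 5 is unchanged.
volume = {3: "new3", 5: "five", 6: "new6", 8: "old3"}
direct = {3: 8, 6: "D02"}
diff_store = {"D02": "old6"}

print(read_snapshot_block(3, volume, direct, diff_store))  # old3
print(read_snapshot_block(6, volume, direct, diff_store))  # old6
print(read_snapshot_block(5, volume, direct, diff_store))  # five
```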

The foregoing detailed description discloses a method for capturing and 
maintaining a differential snapshot in which logically significant writes of data blocks are 
distinguished from logically insignificant moves of block data. The ability of the 
snapshotter to make this distinction is accomplished by an innovation in the file system 
whereby a BLOCK_COPY command can be passed to drivers below the file system
level, which also enables those drivers to take advantage of hardware acceleration of data
block copies. With respect to the differential snapshotter, substantial economies of
processing time and storage space are achieved. While, as those skilled in the art will 
readily recognize, the invention is susceptible to various modifications and alternative 
constructions, certain illustrative embodiments have been shown in the accompanying 
drawings and have been described above in detail. It should be understood, however, that 
there is no intention to limit the invention to the specific forms disclosed. On the 
contrary, the intention is to cover all modifications, alternative constructions, and 
equivalents falling within the spirit and scope of the invention. 